Response Time Reduction of Speech Recognizers Using Single Gaussians
Authors
Abstract
In this paper, we propose an algorithm for reducing the response time of HMM-based speech recognizers. The algorithm uses single Gaussians to select promising HMM states. During recognition, each HMM state likelihood is first evaluated with its corresponding single Gaussian; the likelihood from the original full Gaussians is then computed, and substituted for the single-Gaussian value, only for those states with relatively large likelihoods. This significantly reduces the pattern-matching time for speech recognition without any noticeable loss in recognition rate. In addition, we cluster the single Gaussians into groups by measuring the distance between Gaussians, which further reduces the extra memory required. On our 10,000-word Korean POI (point-of-interest) recognition task, the proposed algorithm reduces the response time by 35.57% relative to the baseline system, at the cost of a 10% degradation in WER.
Key words: speech recognition, fast likelihood computation
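The two-pass evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the single Gaussians are derived or how "relatively large" is decided, so the moment-matching step (`collapse_to_single`), the log-likelihood `beam` threshold, and every function and variable name here are assumptions made for the example. Diagonal covariances are also assumed.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, weights, means, variances):
    """Log density of a diagonal Gaussian mixture, using log-sum-exp."""
    terms = [math.log(w) + log_gauss_diag(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def collapse_to_single(weights, means, variances):
    """Moment-match a mixture to one diagonal Gaussian (assumed choice)."""
    dim = len(means[0])
    mean = [sum(w * m[d] for w, m in zip(weights, means)) for d in range(dim)]
    var = [
        sum(w * (v[d] + m[d] ** 2)
            for w, m, v in zip(weights, means, variances)) - mean[d] ** 2
        for d in range(dim)
    ]
    return mean, var

def make_state(weights, means, variances):
    """Bundle a state's full mixture with its precomputed single Gaussian."""
    return {"weights": weights, "means": means, "variances": variances,
            "single": collapse_to_single(weights, means, variances)}

def two_pass_scores(x, states, beam=5.0):
    """Score every state cheaply, then refine only the promising ones.

    `beam` is a hypothetical selection rule: a state is refined when its
    single-Gaussian log-likelihood is within `beam` of the best one.
    """
    coarse = [log_gauss_diag(x, *s["single"]) for s in states]
    best = max(coarse)
    scores = list(coarse)
    refined = []
    for i, s in enumerate(states):
        if coarse[i] >= best - beam:
            # Replace the coarse score with the full-mixture likelihood.
            scores[i] = log_gmm(x, s["weights"], s["means"], s["variances"])
            refined.append(i)
    return scores, refined

# Toy example: three 1-D states, two near the observation and one far away.
states = [
    make_state([0.5, 0.5], [[0.0], [1.0]], [[1.0], [1.0]]),
    make_state([0.5, 0.5], [[0.5], [1.5]], [[1.0], [1.0]]),
    make_state([0.5, 0.5], [[10.0], [11.0]], [[1.0], [1.0]]),
]
x = [0.2]
scores, refined = two_pass_scores(x, states, beam=5.0)
print("states refined with the full mixture:", refined)
```

In this toy setup the distant third state keeps its single-Gaussian score, so the full mixture is evaluated for only two of the three states. The saving grows with the number of mixture components per state and with the fraction of states that fall outside the threshold.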
Similar resources
The bucket box intersection (BBI) algorithm for fast approximative evaluation of diagonal mixture Gaussians
Today, most of the state-of-the-art speech recognizers are based on Hidden Markov modeling. Using semi-continuous or continuous density Hidden Markov Models, the computation of emission probabilities requires the evaluation of mixture Gaussian probability density functions. Since it is very expensive to evaluate all the Gaussians of the mixture density codebook, many recognizers only compute th...
Tied Mixtures in the Lincoln Robust CSR
HMM recognizers using either a single Gaussian or a Gaussian mixture per state have been shown to work fairly well for 1000-word vocabulary continuous speech recognition. However, the large number of Gaussians required to cover the entire English language makes these systems unwieldy for large vocabulary tasks. Tied mixtures offer a more compact way of representing the observation pdf's. We hav...
Discriminative feature weighting for HMM-based continuous speech recognizers
The Discriminative Feature Extraction (DFE) method provides an appropriate formalism for the design of the frontend feature extraction module in pattern classification systems. In the recent years, this formalism has been successfully applied to different speech recognition problems, like classification of vowels, classification of phonemes or isolated word recognition. The DFE formalism can be...
Techniques for capturing temporal variations in speech signals with fixed-rate processing
Fixed-rate feature extraction which is used in most current speech recognizers is equivalent to sampling the feature trajectories at a uniform rate. Often this sampling rate is well below the Nyquist rate and thus leads to distortions in the sampled feature stream due to aliasing. In this paper we explore various techniques, ranging from simple cepstral and spectral smoothing to filtering and dat...
Maximum-likelihood stochastic-transformation adaptation of hidden Markov models
The recognition accuracy in recent large vocabulary automatic speech recognition (ASR) systems is highly related to the existing mismatch between the training and testing sets. For example, dialect differences across the training and testing speakers result in a significant degradation in recognition performance. Some popular adaptation approaches improve the recognition performance of speech r...
Journal: IEICE Transactions
Volume: 90-D
Publication date: 2007